SUMMARY

Three multivariate techniques were considered to evaluate different types and 
grades of tobacco using the results of nine chemical quality parameters. 

Samples of tobacco already graded according to conventional criteria were
submitted to a discriminant analysis to evaluate the effectiveness of the grading
systems. From the Mahalanobis distances an order of similarity between types and
grades was deduced, which makes it possible to search for compatible tobaccos with
similar chemical properties. By principal component analysis the system of nine
variables was reduced to three components, with which the result of the
discriminant analysis was reproduced at 94%. A data file from different kinds of
tobacco was analyzed by a non-hierarchical cluster method, which showed the great
capacity of this technique to gather objects of similar chemical nature into
homogeneous groups, and its usefulness in the development of better grading
criteria.

The application of these techniques to physical and organoleptic parameters, in
addition to the chemical ones, will provide a powerful tool for future studies in
this field.


RESUME 

Three multivariate techniques were used to evaluate the different types and
grades of tobacco using the results of nine chemical quality parameters.

Starting from samples previously classified according to conventional physical
criteria, a discriminant analysis was used to evaluate the effectiveness of the
classification systems. Using the Mahalanobis distances, the order of similarity
between the various types and grades could be deduced, allowing the search for
compatible tobaccos according to their chemical properties. Starting from the
principal components, the system of nine variables could be reduced to three
variables; in this way, the results of the discriminant analysis were reproduced
at 94%. A non-hierarchical cluster method was applied to several samples of
different classes of tobacco having similar chemical properties. This technique
showed great potential for the development of better classification criteria.

The application of these methods, including organoleptic and physical parameters
in addition to the chemical parameters, could become a powerful tool for later
research.
INTRODUCTION


Multivariate analysis is a family of statistical techniques that work with
several variables simultaneously, allowing an integral analysis of the
information. Among the techniques of this type are Discriminant Analysis,
Principal Component Analysis (PC) and Cluster Analysis.

Discriminant analysis is a supervised technique that starts from classified
observations to determine whether or not an object belongs to an assigned
population. The objective of principal component analysis is to reduce the
original variables to a few components without a significant loss of information.
Cluster analysis is a non-supervised classification technique that groups
objects according to some similarity criterion.

In this study, the foregoing techniques were applied to the analysis of the
tobacco grading system, the comparison between crops and the definition of
equivalences between different grades of tobacco. A non-hierarchical cluster
analysis was carried out with samples of different varieties in order to form
groups of samples having similar chemical characteristics.

EXPERIMENTAL 

a. Sample Identification 

For the evaluation of the grading system, seven grades of the 1988 Flue-cured
tobacco crop were classified according to conventional criteria (color, body, size,
stalk position, etc.). The comparison of crops was made between the 1986, 1987 and
1988 Flue-cured tobacco crops (Table 1). To group tobacco samples according to
their chemical similarities, samples of Flue-cured, Burley, Black air-cured and
another air-cured type called Extra were employed (Table 2).


Table 1. Sample identification. Seven Flue-cured grades (A-G). 1986, 1987, 1988 crops

1986   n    1987   n    1988   n    Description
A6     5    A7     9    A8     9    Cutters
B6     5    B7     8    B8     8    Leaf
C6     8    C7     9    C8     9    Lugs
D6     6    D7     9    D8     9    Smoking leaf
E6     6    E7     9    E8     9    Primings
F6     1    F7     9    F8     9    Upper leaf
G6     5    G7     9    G8     9    Mixed-nondescript

n: number of samples analyzed


Table 2. Sample identification (ID) for Non-Hierarchical Cluster Analysis

ID     n    Description
H      10   Leaf, Extra air-cured
I      10   Leaf, Black air-cured
J      10   Leaf, Flue-cured
K      9    Stem, Flue-cured
L      10   Leaf, Burley

n: number of samples analyzed


b. Chemical Quality Parameters 

The parameters used as quality criteria were: Reducing Sugars, Total Nitrogen,
Nicotine, Total Volatile Bases (TVB), Petroleum Ether Extracts (PEE), Ashes,
Chlorides, and the Sugar-Nicotine (Sug-Nic) and Nitrogen-Nicotine (Nit-Nic)
ratios. The chemical analyses were carried out according to CORESTA, ISO and other
standard methods used for tobacco.

c. Data Analysis 

The following software packages were used for the statistical analysis of the
information: Systat 4.1, Statgraphics 4.0, Minitab 6.1, Lotus 2.2, Asym v and
Harvard Graphics 2.13.

RESULTS AND DISCUSSION 


a. Evaluation of Tobacco Grading Systems 


This evaluation was made using discriminant analysis applied to seven grades
of the 1988 Flue-cured tobacco crop. A preliminary analysis of variance showed that
the nine chemical parameters considered had discriminant capability. The
classification criterion was the minimum Mahalanobis distance ($D^2_{iz}$),
according to the following definition:

Classify $z$ into $\pi_i$ if

$$D^2_{iz} = \min\{D^2_{1z}, D^2_{2z}, \ldots, D^2_{7z}\}$$

where $i = 1, \ldots, 7$ and $z = 1, \ldots, 62$.
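The minimum-distance rule above can be illustrated with a short sketch. It assumes that the distance of a sample to each grade is measured from the grade mean with a pooled within-group covariance matrix, which is a common choice but is not stated in the text. The two groups, the two variables and all values are invented for illustration, and the helper names (`pooled_covariance`, `classify`, etc.) are hypothetical.

```python
# Minimum-Mahalanobis-distance classification (sketch, standard library only).
# All data are made up for illustration; they are not the paper's measurements.

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def pooled_covariance(groups):
    """Pooled within-group covariance (assumption: one common covariance)."""
    p = len(next(iter(groups.values()))[0])
    total = sum(len(rows) for rows in groups.values())
    s = [[0.0] * p for _ in range(p)]
    for rows in groups.values():
        m = mean_vector(rows)
        for r in rows:
            d = [x - mu for x, mu in zip(r, m)]
            for a in range(p):
                for b in range(p):
                    s[a][b] += d[a] * d[b]
    dof = total - len(groups)
    return [[v / dof for v in row] for row in s]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mahalanobis_sq(z, mean, s_inv):
    d = [x - mu for x, mu in zip(z, mean)]
    n = len(d)
    return sum(d[a] * s_inv[a][b] * d[b] for a in range(n) for b in range(n))

def classify(z, groups):
    """Return the group with the smallest squared distance, plus all distances."""
    s_inv = invert(pooled_covariance(groups))
    dists = {g: mahalanobis_sq(z, mean_vector(rows), s_inv)
             for g, rows in groups.items()}
    return min(dists, key=dists.get), dists

# Hypothetical (sugars %, nicotine %) values for two invented grades.
groups = {
    "A": [[20.0, 2.0], [22.0, 2.2], [21.0, 1.9], [19.0, 2.1]],
    "B": [[12.0, 3.0], [13.0, 3.2], [11.0, 2.9], [12.5, 3.1]],
}
label, distances = classify([20.5, 2.05], groups)
print(label, distances)
```

With seven grades and nine variables the same functions apply unchanged; only the `groups` dictionary grows.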


The results are shown in Table 3, where the "Total" row represents the number
of pre-classified samples in each group according to conventional criteria, and
the columns contain the number of samples classified by the discriminant analysis.
Taking grade A as an example, eight out of nine samples were classified as A
and one was re-classified as C. Four of the 62 samples analyzed were
re-classified into different grades, indicating that the discriminant analysis
reproduced the conventional grading at 93.5%.
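The percentages quoted for Table 3 follow directly from the per-grade counts. The short check below uses the "Total" and "Classified by DA" rows of Table 3; the variable names are introduced only for this illustration.

```python
# Reproduction rate of the conventional grading, from the counts in Table 3.
grades = ["A", "B", "C", "D", "E", "F", "G"]
total = [9, 8, 9, 9, 9, 9, 9]      # pre-classified samples per grade
correct = [8, 8, 8, 8, 8, 9, 9]    # samples kept in their grade by the DA

# Per-grade percentage, rounded to whole numbers as in the table.
per_grade = {g: round(100 * c / t) for g, c, t in zip(grades, correct, total)}

# Overall percentage over all 62 samples.
overall = 100 * sum(correct) / sum(total)

print(per_grade)
print(round(overall, 1))
```

The overall figure is simply 58 correctly reproduced samples out of 62.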


Table 3. Discriminant Analysis (DA). 1988 Crop. Nine Variables by Grade

Grade                A     B     C     D     E     F     G
Total                9     8     9     9     9     9     9
Classified by DA     8     8     8     8     8     9     9
Misclassified        1     0     1     1     1     0     0
Reproduced (%)      89   100    89    89    89   100   100


Table 4. Principal Component Analysis (PC). Component Loadings

Variable                    PC1      PC2      PC3
Sugar                     -0.96     0.14    -0.03
Nitrogen                   0.83     0.08     0.53
Nicotine                   0.14     0.98     0.05
PEE                        0.60     0.18    -0.69
TVB                        0.87     0.26     0.26
Ashes                      0.65    -0.46    -0.51
Chlorides                  0.45    -0.14    -0.26
Sug-Nic                   -0.90    -0.35     0.01
Nit-Nic                    0.47    -0.18     0.31

Variance (%)              48.95    22.73    14.35
Cumulative variance (%)   48.95    71.68    86.03
These same samples were analyzed by the Principal Component method (PC)
using the correlation matrix to avoid the effect of differences in the units of
measure (Table 4). The loading vectors of the components represent the
correlation between each PC and the variables analyzed. PC1 accounts for nearly
50% of the system's total variance, representing mainly the variables Reducing
Sugars, Nitrogen, TVB, Ashes, Chlorides and the Sug-Nic ratio, while PC2,
dominated by Nicotine and the Nit-Nic ratio, represents almost 23% of the total.
Thus, the nine parameters can be reduced to three new variables (PC1, PC2, PC3)
that retain 86% of the information contained in the original variables,
significantly reducing the dimensionality of the system and allowing a
comprehensive analysis of the information. The discriminant analysis shown in
Table 3 was reproduced at 93.6% when the three PC were used (Table 5); an
analysis with four PC gave the same results as with three.
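A principal component analysis on the correlation matrix can be sketched as follows. The eigen-decomposition uses cyclic Jacobi rotations only so that the example runs with the standard library; it is not a claim about the algorithm in the packages used for this study. The three-variable data set is invented, and the function names are hypothetical. Loadings are computed as eigenvector elements scaled by the square root of the eigenvalue, which makes them correlations between variables and components.

```python
import math

def correlation_matrix(rows):
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    z = [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]
    p = len(cols)
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]

def jacobi_eigen(sym, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):   # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):   # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def pca_from_correlation(rows):
    """Return (% variance per component, loadings[variable][component])."""
    values, vectors = jacobi_eigen(correlation_matrix(rows))
    order = sorted(range(len(values)), key=lambda i: -values[i])
    p = len(values)
    explained = [100 * values[i] / p for i in order]
    loadings = [[vectors[k][i] * math.sqrt(max(values[i], 0.0)) for i in order]
                for k in range(p)]
    return explained, loadings

# Invented values for three variables (e.g. sugars, nitrogen, nicotine).
data = [
    [18.2, 2.1, 2.4],
    [20.5, 1.9, 2.0],
    [15.1, 2.6, 3.1],
    [22.3, 1.8, 1.7],
    [12.4, 2.9, 3.3],
    [17.0, 2.3, 2.9],
]
explained, loadings = pca_from_correlation(data)
print([round(e, 2) for e in explained])
```

Because the correlation matrix has ones on its diagonal, the percentages sum to 100 and the squared loadings of each variable over all components sum to 1.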


Table 5. Discriminant Analysis (DA). 1988 Crop. Three Principal Components

Grade                A     B     C     D     E     F     G
Total                9     8     9     9     9     9     9
Classified by DA     8     8     7     8     8     8     8
Misclassified        1     0     2     1     1     1     1
Reproduced (%)      89   100    78    89    89    89    89

Using PC1 and PC2, a dispersion diagram was made for the grades considered
(Fig. 1). Grades F and G showed the highest variability, which was confirmed
by the results of the generalized variance. The face system developed by
Riedwyl and Flury in 1986, in which each element of the face represents a different
variable (Fig. 2), was used with the mean vectors of each grade in their first two
PC, and also with the mean vectors of the nine original variables (Fig. 3). The
differences or similarities between grades (greater or smaller distance between
faces) can be observed, as well as the differences between variables in each
grade (differences in the elements of each face).

b. Comparison of Crops 

Seven grades from each of the 1986, 1987 and 1988 Flue-cured crops were
compared with one another by means of their PC. Figure 4 shows that grades D,
E, F and G form separate groups, indicating a good correlation within each group
across the different crops. Grades B and C lie close together, and grade A had the
highest variability in all three crops; grade A6 seems more similar to grade C
than to grades A7 and A8. The mean vector of the chemical variables for each
grade can be observed in Figure 5.

FIGURE 1. Principal component analysis. Dispersion analysis, Flue-cured tobacco
grades, 1988 crop (horizontal axis: principal component 1).

FIGURE 2. Chemical variables (face elements; legible labels: Reducing Sugars,
Total Nitrogen).

FIGURE 3. Principal component analysis. Mean vectors, Flue-cured tobacco
grades, 1988 crop.

FIGURE 4. Principal component analysis of Flue-cured tobacco. Comparison of the
same grades between 1986, 1987 and 1988 crops.

A generalized Hotelling test was conducted on the Mahalanobis distances among
the 21 grades (Table 6), using the information from the first three PC, to
determine chemical equivalence between grades. Table 7 shows the results for
some grade pairs. The pairs D6-D8, C6-C7 and C6-A7 were found to be equivalent,
since their calculated F values fall below the critical value F (3,143) = 3.92.
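For orientation, the usual way of turning a Mahalanobis distance between two group means into an F statistic is sketched below. This is the standard Hotelling relation, given under the assumption that the generalized test applied here has this general form; the text does not spell it out. The symbols $n_1$, $n_2$ (sample sizes of the two grades), $p$ (number of variables, here the three PC), $S$ (pooled covariance matrix) and $\nu$ (degrees of freedom of $S$) are introduced only for this illustration.

```latex
D^2 = (\bar{x}_1 - \bar{x}_2)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
T^2 = \frac{n_1 n_2}{n_1 + n_2}\, D^2,
\qquad
F = \frac{\nu - p + 1}{\nu\, p}\, T^2 \;\sim\; F(p,\ \nu - p + 1)
```

When $S$ is pooled from the two groups alone, $\nu = n_1 + n_2 - 2$. Two grades are taken as equivalent when the calculated $F$ does not exceed the tabulated critical value.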

The mean vectors of the three PC of each grade were analyzed with a
hierarchical cluster method (Nearest Neighbor using Euclidean distance) and the
results obtained are shown in Figure 6. The groups formed by D, E, F and G
confirm the consistency of these grades across the three crops. The groups A8,
A6-C8-A7-C6-C7-B6 and B7-B8 show a deviation from the expected classification. A
rigorous examination of the samples of the grades that show inconsistency could
lead to a re-classification or to a simplification or change in the grading criteria.
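Nearest-neighbour (single-linkage) clustering on Euclidean distances, as named above, can be sketched in a few lines. The five labelled points stand in for grade mean vectors in three PC; their labels and coordinates are invented for illustration and the function name is hypothetical.

```python
import math

def single_linkage(points):
    """Agglomerative nearest-neighbour clustering.

    Returns the merges in order as (sorted member labels, merge distance).
    """
    clusters = {name: [name] for name in points}
    merges = []

    def gap(a, b):
        # Single linkage: distance between the two closest members.
        return min(math.dist(points[i], points[j])
                   for i in clusters[a] for j in clusters[b])

    while len(clusters) > 1:
        keys = list(clusters)
        a, b = min(((x, y) for i, x in enumerate(keys) for y in keys[i + 1:]),
                   key=lambda pair: gap(*pair))
        d = gap(a, b)
        merged = sorted(clusters.pop(a) + clusters.pop(b))
        merges.append((merged, round(d, 3)))
        clusters["+".join(merged)] = merged
    return merges

# Hypothetical mean vectors in the space of three principal components.
points = {
    "p1": (2.0, 0.1, 0.0),
    "p2": (2.1, 0.0, 0.1),
    "p3": (1.9, 0.2, 0.0),
    "p4": (-1.5, 1.0, 0.3),
    "p5": (-1.3, 1.1, 0.3),
}
merges = single_linkage(points)
for members, distance in merges:
    print(members, distance)
```

Reading the merges from first to last gives the same information as a dendrogram: objects that join at small distances are chemically close, and a late merge at a large distance separates distinct groups.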


Table 6. Mahalanobis Distances (triangular matrix of pairwise distances among
the 21 grades; numerical entries not legible)


Table 7. Generalized Hotelling T². Multivariate Analysis of Variance

Pairs      D²      F calculated
D6-D8      0.18    1.91
C6-C7      0.13    2.42
C6-A7      0.23    2.38
D6-D7      0.44    4.68
F7-F8      0.36    4.77
B7-B8      0.42    5.02
F8-F8      0.45    5.19
R6-FT      0.51    5.89
E6-E8      0.73    7.73

F (3,143) = 3.92


c. Classification According to Chemical Quality Parameters 


A data file (Table 2) containing the chemical analyses of 49 non-classified
samples of Flue-cured, Burley, Black air-cured and Extra air-cured tobaccos was
analyzed by a non-hierarchical cluster method, using the first four PC, which
contain 97.8% of the total population variance, in order to evaluate the ability
of this technique to group samples of similar chemical characteristics
(Figure 7). Five of the six groups formed correspond clearly to the five types of
tobacco, and only one sample was placed in a separate group. This technique has
great potential for classifying tobaccos of unknown nature according to their
chemical quality parameters.
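The text does not name the non-hierarchical method used. As an illustration only, the sketch below uses k-means (Lloyd's algorithm), one widely used non-hierarchical method, with starting centres supplied by the caller. The points are invented two-dimensional scores, not the 49 samples of Table 2, and the function name is hypothetical.

```python
import math

def kmeans(points, centers, iterations=100):
    """Lloyd's algorithm: assign to nearest centre, recompute centres, repeat."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centre for every point.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # empty cluster: keep its centre
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical PC scores forming two well-separated groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

Unlike the hierarchical method, the number of groups must be fixed in advance, and the result can depend on the starting centres.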


CONCLUSION 


Multivariate techniques are a more powerful tool than the univariate statistics
used until now for the evaluation of tobacco grading systems, the comparison of
crops and the determination of equivalences among tobaccos. Although this study
used only chemical variables, the inclusion of other parameters, such as
organoleptic and physical variables in addition to the chemical ones, would give
more complete information and strengthen these techniques for the foregoing
applications.

FIGURE 5. Principal component analysis of Flue-cured tobacco. Grades of 1986,
1987 and 1988 crops (axis label: 2 (23.7%)).

FIGURE 6. Hierarchical cluster analysis (nearest neighbour, Euclidean distance).

FIGURE 7